Incremental learning based proactive caching mechanism for RocksDB key-value system

Keyun LUO, Baoliu YE, Bin TANG, Feng MEI, Wenda LU

Journal of Computer Applications 2020, 40 (2): 321-327. DOI: 10.11772/j.issn.1001-9081.2019091616

The RocksDB key-value storage system, which is built on a Log-Structured Merge (LSM) tree, suffers from low read performance because of the constraints of its hierarchical structure. One effective solution is to cache hot spot data proactively, but this faces two challenges: how to predict the hot spot data when the data distribution keeps changing, and how to integrate the proactive caching mechanism with the RocksDB storage structure. To tackle these challenges, a proactive caching framework for the RocksDB key-value system was built, with components for data collection, system interaction and system evaluation; it caches the hot spot data at the low levels of the LSM tree. In addition, by modeling data access patterns, an incremental learning based prediction method for hot spot data was designed and implemented, which reduces the number of I/O operations on the storage medium. Experimental results show that the proposed mechanism can effectively improve the read performance of RocksDB under different dynamic workloads.
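The abstract does not describe the prediction model itself. As a rough illustration of what incremental hot spot prediction can look like, the sketch below keeps an exponentially decayed access score per key and updates it one access window at a time, so that the ranking follows a shifting distribution. The class name, the `decay` parameter and the scoring rule are all assumptions made for this example, not the method from the paper.

```python
from collections import defaultdict


class DecayedHotKeyPredictor:
    """Hypothetical hot-key predictor; not the model used in the paper."""

    def __init__(self, decay=0.5, cache_size=2):
        self.decay = decay            # weight kept by old observations per window
        self.cache_size = cache_size  # number of keys to recommend for caching
        self.scores = defaultdict(float)

    def update(self, window_keys):
        """Incrementally fold one window of observed key accesses into the scores."""
        # Age every existing score so that stale hot spots fade out.
        for key in list(self.scores):
            self.scores[key] *= self.decay
        # Each access in the new window adds one unit of score.
        for key in window_keys:
            self.scores[key] += 1.0

    def hot_keys(self):
        """Return the keys with the highest scores (ties broken by key order)."""
        ranked = sorted(self.scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [key for key, _ in ranked[: self.cache_size]]


predictor = DecayedHotKeyPredictor(decay=0.5, cache_size=2)
predictor.update(["a", "a", "b"])
first = predictor.hot_keys()
predictor.update(["c", "c", "c"])
second = predictor.hot_keys()
```

After the second window the key `c` overtakes `a`, while `b` drops out of the recommended set. That shift is the behavior a proactive cache would need under a changing workload.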

High performance key-value storage system based on remote direct memory access

Cheng WANG, Baoliu YE, Feng MEI, Wenda LU

Journal of Computer Applications 2020, 40 (2): 316-320. DOI: 10.11772/j.issn.1001-9081.2019091635

With the continuous growth of data volume and system size, network communication has become a performance bottleneck of key-value storage systems. Meanwhile, Remote Direct Memory Access (RDMA) supports high-bandwidth, low-latency data transmission, which offers a new approach to designing key-value storage systems. Based on RDMA in high performance networks, a key-value storage system named Chequer, with high performance and low CPU overhead, was designed and implemented. The basic operation workflow of the key-value storage system was redesigned around the characteristics of RDMA primitives. In addition, a shared hash table based on linear probing was designed; by solving the problem of client cache invalidation and increasing the hash hit rate, it reduces the number of client read rounds and further improves system performance. The Chequer system was implemented on a small-scale cluster, and its performance was demonstrated by experiments.
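Linear probing itself is a standard open-addressing technique. The sketch below is a minimal single-process version: a colliding key is placed in the next free slot, so a key and its collision chain occupy neighboring slots. The class and method names are illustrative, and nothing here models RDMA or Chequer's actual table layout, which the abstract does not specify.

```python
class LinearProbingTable:
    """Minimal fixed-capacity hash table with linear probing (no deletion)."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.slots = [None] * capacity  # each slot is None or a (key, value) pair

    def _home(self, key):
        # Home slot of a key; small non-negative ints hash to themselves in CPython.
        return hash(key) % self.capacity

    def put(self, key, value):
        """Insert or overwrite; returns the slot index that was used."""
        home = self._home(key)
        for step in range(self.capacity):
            i = (home + step) % self.capacity
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return i
        raise RuntimeError("table is full")

    def get(self, key):
        """Probe forward from the home slot until the key or an empty slot is found."""
        home = self._home(key)
        for step in range(self.capacity):
            i = (home + step) % self.capacity
            slot = self.slots[i]
            if slot is None:
                return None  # an empty slot ends the collision chain
            if slot[0] == key:
                return slot[1]
        return None


table = LinearProbingTable(capacity=4)
slot_a = table.put(1, "a")  # home slot 1
slot_b = table.put(5, "b")  # also hashes to slot 1, so it moves to slot 2
```

Because colliding entries sit next to each other, a reader that fetches a contiguous range of slots starting at the home slot is likely to find the key within that range. This locality is one plausible reason linear probing suits one-sided remote reads, though this explanation is an inference and is not stated in the abstract.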

Efficient storage scheme for deadline aware distributed matrix multiplication

Yongzhu ZHAO, Weidong LI, Bin TANG, Feng MEI, Wenda LU

Journal of Computer Applications 2020, 40 (2): 311-315. DOI: 10.11772/j.issn.1001-9081.2019091640

Distributed matrix multiplication is a fundamental operation in many distributed machine learning and scientific computing applications, but its performance is greatly affected by the stragglers that commonly exist in such systems. Recently, researchers have proposed a fountain code based coded matrix multiplication method, which effectively mitigates the effect of stragglers by fully exploiting their partial results. However, it does not consider the storage cost of worker nodes. By considering the tradeoff between the storage cost and the finish time of the computation, a computational deadline-aware storage optimization problem for heterogeneous worker nodes was first formulated. Then, through theoretical analysis, a solution based on expectation approximation was presented, and the problem was relaxed into a convex optimization problem so that it can be solved efficiently. Simulation results show that, while maintaining a high task success rate, the storage overhead of the proposed scheme decreases rapidly as the task deadline is relaxed, and the scheme greatly reduces the storage overhead introduced by encoding. In other words, the proposed scheme can significantly reduce the extra storage overhead while guaranteeing that the whole computation finishes before the deadline with high probability.
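The tradeoff between storage and finish time can be illustrated with a deliberately simplified, deterministic toy model. It assumes that each worker stores a fixed number of coded rows and processes them at a constant speed, and that the master can decode once any `rows_needed` row results have arrived. The function name, its parameters and the constant-speed assumption are all hypothetical; the paper's formulation is probabilistic and is not reproduced here.

```python
def finish_time(speeds, storage_per_worker, rows_needed):
    """Time at which the master has collected `rows_needed` row results.

    speeds: rows per second for each worker (toy assumption: constant speed).
    storage_per_worker: number of coded rows stored on every worker.
    Returns None if the stored rows can never add up to `rows_needed`.
    """
    completion_times = []
    for speed in speeds:
        # A worker finishes its k-th stored row at time k / speed.
        for k in range(1, storage_per_worker + 1):
            completion_times.append(k / speed)
    if len(completion_times) < rows_needed:
        return None
    completion_times.sort()
    # Partial results from slow workers count too: take the earliest results overall.
    return completion_times[rows_needed - 1]


speeds = [4.0, 1.0]  # one fast worker and one straggler
tight = finish_time(speeds, storage_per_worker=4, rows_needed=4)
loose = finish_time(speeds, storage_per_worker=2, rows_needed=4)
```

In this toy setting, storing 4 rows per worker finishes at time 1.0, while storing only 2 rows per worker finishes at time 2.0. Read the other way around, relaxing the deadline from 1.0 to 2.0 halves the required storage, which mirrors the qualitative trend reported in the abstract.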
